Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: message-id in postprocessor/gelf-chunking #2662

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

BharatKJain
Copy link

Pull request

Description

Changed message-id from auto-increment ID to randomized ID in postprocessor/gelf_chunking.rs

HELP NEEDED: I have avoided adding hostname while producing message-id, I am not sure how can we handle adding hostname, please suggest.

Related

Checklist

  • The RFC, if required, has been submitted and approved
  • Any user-facing impact of the changes is reflected in docs.tremor.rs
  • The code is tested
  • Use of unsafe code is reasoned about in a comment
  • Update CHANGELOG.md appropriately, recording any changes, bug fixes, or other observable changes in behavior
  • The performance impact of the change is measured (see below)

Performance

Copy link

codecov bot commented Sep 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.22%. Comparing base (bff8093) to head (3e4f5a1).

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #2662   +/-   ##
=======================================
  Coverage   91.22%   91.22%           
=======================================
  Files         309      309           
  Lines       60078    60083    +5     
=======================================
+ Hits        54805    54812    +7     
+ Misses       5273     5271    -2     
Flag Coverage Δ
e2e-command 11.29% <0.00%> (-0.01%) ⬇️
e2e-integration 50.40% <0.00%> (+<0.01%) ⬆️
e2e-unit 12.57% <0.00%> (-0.01%) ⬇️
e2etests 52.73% <0.00%> (+<0.01%) ⬆️
tremorapi 14.51% <0.00%> (-0.01%) ⬇️
tremorcodec 62.66% <ø> (ø)
tremorcommon 63.04% <ø> (ø)
tremorconnectors 28.86% <0.00%> (-0.01%) ⬇️
tremorconnectorsaws 11.27% <0.00%> (-0.01%) ⬇️
tremorconnectorsazure 4.69% <0.00%> (-0.01%) ⬇️
tremorconnectorsgcp 25.31% <0.00%> (-0.01%) ⬇️
tremorconnectorsobjectstorage 0.06% <ø> (ø)
tremorconnectorsotel 12.58% <0.00%> (-0.01%) ⬇️
tremorconnectorstesthelpers 68.25% <ø> (ø)
tremorinflux 87.71% <ø> (ø)
tremorinterceptor 54.33% <100.00%> (+0.04%) ⬆️
tremorpipeline 31.17% <ø> (ø)
tremorruntime 47.25% <0.00%> (+<0.01%) ⬆️
tremorscript 55.12% <ø> (ø)
tremorsystem 5.78% <ø> (ø)
tremorvalue 69.55% <ø> (ø)
unittests 89.08% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...mor-interceptor/src/postprocessor/gelf_chunking.rs 92.47% <100.00%> (+0.42%) ⬆️

... and 3 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bff8093...3e4f5a1. Read the comment docs.

Copy link
Member

@Licenser Licenser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 looks reasonable nothing to prevent using this I see, great documentation too :)

two things I notice.

  1. it would be nice to mention how message id's are generated in the docs (the //! section)

  2. if you are up for a challenge: we try to keep all time and randomness out of tremor to allow for deterministic replays. We do this by using ingest_ns for as a random seed, and for times, that way a event that is logged with it's ingest ns can be replayed and generate the exact same out put. The random function is a good example. It'd be interesting to see this same concept re-used for this to allow repeatable yet still random message id's; one way would be to use ingest-ns instead of the current epoch (which would be nice anyway as looking uo time isn't fast), and then seed the RNG somehow (probably not with the ingest ns as that would make it useless) but perhaps with the fist n bytes of the message? or with some bytes of a hash of the message?

@BharatKJain
Copy link
Author

  1. Okay, will do.

  2. Trying to think-out-loud, so let's say if we create a hash of a message but repetitive logs will create same message-id which is a problem because message-id has to be unique in nature, collisions can cause problems when we're decoding the message on the server side. I am not completely sure but ideally we would want to have uniqueness in the message-id to make sure that we are not breaking the server GELF-decoding.

(How it will break server-side due to collision? So message-id is a way of determining if the UDP packet is associated with already existing log or it's for a new log, when we're sending same message-id for multiple logs then server behaviour will be to merge the data-together which will end-up breaking the log)

TBH I am also trying to figure this out, please share any suggestions, am I thinking right? 😅

@Licenser
Copy link
Member

Licenser commented Oct 3, 2024

Ja just the message content would not work, I'm still considering if message content + ingest_ns (nanosecond when the message was registered at tremor) would be enough, if a server produces the same log twice in the same nanosecond that'd be very odd (but not impossible) OTOH having two random generated numbers be the same is also odd (but not impossible) it would also one a more deterministic failure case "When messages with the same content arrive at exactly the same time they will get duplicated message ids" instead of "if the RNG hates you, you'll get duplicated message ids"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants